
    Efficient Search for Diverse Coherent Explanations

    This paper proposes new search algorithms for counterfactual explanations based on mixed integer programming. We are concerned with complex data in which variables may take any value from a contiguous range or from an additional set of discrete states. We propose a novel set of constraints that we refer to as a "mixed polytope" and show how it can be used with an integer programming solver to efficiently find coherent counterfactual explanations, i.e. solutions that are guaranteed to map back onto the underlying data structure, while avoiding the need for brute-force enumeration. We also consider the problem of diverse explanations and show how these can be generated within our framework. Comment: FAT* 201
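    The "mixed polytope" idea can be illustrated with a small sketch (hypothetical encoding, not the paper's exact formulation): a variable either takes a value in a contiguous range or one of a few discrete states, and a coherent solution must activate exactly one branch so it maps back onto the data schema.

```python
# Hypothetical sketch of a mixed (continuous-or-discrete) variable encoding:
# one-hot indicators select either the continuous branch or a discrete state,
# and a coherence check verifies the solution maps back onto the data schema.

LO, HI = 0.0, 10.0           # contiguous range (illustrative bounds)
STATES = ["missing", "n/a"]  # additional discrete states (illustrative)

def is_coherent(indicators, value):
    """indicators: one-hot over [continuous] + STATES; value: float or None."""
    if sum(indicators) != 1 or any(b not in (0, 1) for b in indicators):
        return False                     # exactly one branch must be active
    if indicators[0] == 1:               # continuous branch selected
        return value is not None and LO <= value <= HI
    return value is None                 # discrete branch: no continuous part

def decode(indicators, value):
    """Map an encoded solution back onto the underlying data structure."""
    assert is_coherent(indicators, value)
    return value if indicators[0] == 1 else STATES[indicators.index(1) - 1]
```

    In a real solver formulation these indicators would be binary decision variables with linear constraints; here they just illustrate what "coherent" means.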

    On Multilingual Training of Neural Dependency Parsers

    We show that a recently proposed neural dependency parser can be improved by joint training on multiple languages from the same family. The parser is implemented as a deep neural network whose only input is orthographic representations of words. In order to successfully parse, the network has to discover how linguistically relevant concepts can be inferred from word spellings. We analyze the representations of characters and words that are learned by the network to establish which properties of languages were accounted for. In particular, we show that the parser has approximately learned to associate Latin characters with their Cyrillic counterparts and that it can group Polish and Russian words that have a similar grammatical function. Finally, we evaluate the parser on selected languages from the Universal Dependencies dataset and show that it is competitive with other recently proposed state-of-the-art methods, while having a simple structure. Comment: preprint accepted into the TSD201
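    The purely orthographic input can be sketched as follows (hypothetical helper names, not the paper's code): words from related languages share one character vocabulary, with Latin and Cyrillic characters side by side, so any cross-script correspondence must be learned by the network itself.

```python
# Illustrative sketch: encode words as fixed-length sequences of character
# indices over a shared vocabulary covering both Latin and Cyrillic scripts.

def build_char_vocab(corpus_words):
    chars = sorted({c for w in corpus_words for c in w})
    return {c: i + 1 for i, c in enumerate(chars)}  # 0 reserved for padding

def encode_word(word, vocab, max_len=12):
    ids = [vocab.get(c, 0) for c in word[:max_len]]
    return ids + [0] * (max_len - len(ids))         # pad to fixed length

# Polish/Russian word pairs with similar meaning, different scripts.
vocab = build_char_vocab(["dom", "дом", "kot", "кот"])
```

    A character-level network fed such sequences sees no explicit link between "d" and "д"; the reported Latin-Cyrillic associations emerge from joint training.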

    SeNA-CNN: Overcoming Catastrophic Forgetting in Convolutional Neural Networks by Selective Network Augmentation

    Lifelong learning aims to develop machine learning systems that can learn new tasks while preserving performance on previously learned tasks. In this paper we present a method, based on selective network augmentation, to overcome catastrophic forgetting in convolutional neural networks: it learns new tasks and preserves performance on old tasks without accessing the data used to train the original model. The experimental results show that SeNA-CNN, in some scenarios, outperforms the state-of-the-art Learning without Forgetting algorithm. Results also show that in some situations it is better to use SeNA-CNN than to train a neural network using isolated learning.
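    The core mechanism can be sketched in a few lines (a hypothetical structure, not the SeNA-CNN implementation): trained branches for old tasks are frozen, and the network is augmented with a fresh trainable branch for each new task, so old-task performance cannot degrade.

```python
# Minimal sketch of selective network augmentation: existing task branches
# are frozen when a new task arrives, and only the new branch is trainable.

class AugmentableNet:
    def __init__(self):
        self.shared = {"conv1": "pretrained"}  # stand-in for shared layers
        self.branches = {}                     # task name -> (layers, frozen?)

    def add_task(self, name):
        # Freeze every existing branch, then attach a fresh trainable one.
        for task in self.branches:
            self.branches[task] = (self.branches[task][0], True)
        self.branches[name] = ({"head": "random_init"}, False)

    def trainable_tasks(self):
        return [t for t, (_, frozen) in self.branches.items() if not frozen]

net = AugmentableNet()
net.add_task("task_A")
net.add_task("task_B")   # task_A is now frozen; only task_B trains
```

    In a deep learning framework, "freezing" would correspond to disabling gradient updates for the old branches' parameters.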

    Spectroscopy of z ~ 7 candidate galaxies: using Lyman α to constrain the neutral fraction of hydrogen in the high-redshift universe

    Following our previous spectroscopic observations of z > 7 galaxies with Gemini/Gemini Near Infra-Red Spectrograph (GNIRS) and Very Large Telescope (VLT)/XSHOOTER, which targeted a total of eight objects, we present here our results from a deeper and larger VLT/FOcal Reducer and Spectrograph (FORS2) spectroscopic sample of Wide Field Camera 3 selected z > 7 candidate galaxies. With our FORS2 setup we cover the 737–1070 nm wavelength range, enabling a search for Lyman α in the redshift range spanning 5.06–7.80. We target 22 z-band dropouts and find no evidence of Lyman α emission, with the exception of a tentative detection (<5σ, which is our adopted criterion for a secure detection) for one object. The upper limits on Lyman α flux and the broad-band magnitudes are used to constrain the rest-frame equivalent widths for this line emission. We analyse our FORS2 observations in combination with our previous GNIRS and XSHOOTER observations, and suggest that a simple model where the fraction of high rest-frame equivalent width emitters follows the trend seen at z = 3–6.5 is inconsistent with our non-detections at z ∼ 7.8 at the 96 per cent confidence level. This may indicate that a significant neutral H I fraction in the intergalactic medium suppresses Lyman α, with an estimated neutral fraction χ_HI ∼ 0.5, in agreement with other estimates.
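    The quoted redshift window follows directly from the instrument's wavelength coverage and the standard redshift relation, using the rest-frame Lyman α wavelength of 121.567 nm:

```python
# Quick check: z = lambda_obs / lambda_rest - 1 applied to the FORS2
# wavelength coverage (737-1070 nm) for rest-frame Lyman alpha.

LYA_REST_NM = 121.567

def lya_redshift(obs_nm):
    """Redshift at which Lyman alpha lands at the observed wavelength."""
    return obs_nm / LYA_REST_NM - 1.0

z_min = lya_redshift(737.0)   # ~5.06, blue end of the FORS2 coverage
z_max = lya_redshift(1070.0)  # ~7.80, red end of the FORS2 coverage
```

    This reproduces the 5.06–7.80 search window stated in the abstract.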

    Improving data-driven global weather prediction using deep convolutional neural networks on a cubed sphere

    We present a significantly improved data-driven global weather forecasting framework using a deep convolutional neural network (CNN) to forecast several basic atmospheric variables on a global grid. New developments in this framework include an offline volume-conservative mapping to a cubed-sphere grid, improvements to the CNN architecture, and the minimization of the loss function over multiple steps in a prediction sequence. The cubed-sphere remapping minimizes the distortion on the cube faces on which convolution operations are performed and provides natural boundary conditions for padding in the CNN. Our improved model produces weather forecasts that remain stable indefinitely and show realistic weather patterns at lead times of several weeks and longer. For short- to medium-range forecasting, our model significantly outperforms persistence, climatology, and a coarse-resolution dynamical numerical weather prediction (NWP) model. Unsurprisingly, our forecasts are worse than those from a high-resolution state-of-the-art operational NWP system. Our data-driven model is able to learn to forecast complex surface temperature patterns from a few input atmospheric state variables. On annual time scales, our model produces a realistic seasonal cycle driven solely by the prescribed variation in top-of-atmosphere solar forcing. Although it is currently less accurate than operational weather forecasting models, our data-driven CNN executes much faster than those models, suggesting that machine learning could prove to be a valuable tool for large-ensemble forecasting. Comment: Manuscript submitted to Journal of Advances in Modeling Earth Systems
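    The multi-step training objective mentioned above can be sketched with a toy model (hypothetical, not the paper's CNN): the loss is accumulated over several forecast steps, with each predicted state fed back in as the next input, rather than penalizing only a single step.

```python
# Sketch of a multi-step (rollout) loss: the model's own predictions are
# fed back autoregressively, and errors are summed over the whole sequence.

def rollout_loss(model_step, state, targets):
    """Sum of squared errors over an autoregressive prediction sequence."""
    total = 0.0
    for target in targets:
        state = model_step(state)       # predict the next state
        total += (state - target) ** 2  # compare against the truth
    return total

# Toy 1-variable "atmosphere": the model damps the state toward zero.
loss = rollout_loss(lambda s: 0.5 * s, 1.0, [0.5, 0.25, 0.125])
```

    Training against such a rollout penalizes error accumulation, which is one way to encourage long-term stability of the forecasts.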

    Assessment of SARS-CoV-2 tests costs and reimbursement tariffs readjustments during the COVID-19 pandemic.

    While laboratories have been facing limited supplies of reagents for diagnostic tests throughout the course of the COVID-19 pandemic, national and international health plans, as well as billing costs, have been constantly adjusted in order to optimize the use of resources. We aimed to assess the impact of SARS-CoV-2 test costs and reimbursement tariff adjustments on diagnostic strategies in Switzerland, to determine the advantages and disadvantages of different cost- and resource-saving plans. We specifically assessed the cost of diagnostic SARS-CoV-2 RT-PCR using five different approaches: i) an in-house platform, ii) cobas 6800® (Roche, Basel, Switzerland), iii) the GeneXpert® SARS-CoV-2 test (Cepheid, Sunnyvale, CA, USA), iv) the VIASURE SARS-CoV-2 (N1 + N2) Real-Time PCR Detection Kit for BD MAX™ (Becton Dickinson, Franklin Lake, NJ, USA), and v) cobas® Liat® SARS-CoV-2 & Influenza A/B (Roche, Basel, Switzerland). We compared these costs to the evolution of the reimbursement tariffs. The cost of a single RT-PCR test varied greatly (as did the volume of tests performed), ranging from as high as 180 CHF per test at the beginning of the pandemic (February to April 2020) to as low as 82 CHF per test at the end of 2020. Depending on the time period within the pandemic, higher costs did not necessarily mean greater benefits for the laboratories. The costs of molecular reagents for rapid tests were higher than those for classic RT-PCR platforms, but the rapid tests had reduced turnaround times (TATs), thus improving patient care and enabling more efficient implementation of isolation measures, as well as reducing the burden of possible nosocomial infections. At the same time, there were periods when the production or distribution of these reagents was insufficient, and only the use of several different molecular platforms allowed us to sustain the high number of tests requested.
Cost-saving plans need to be thoroughly assessed and constantly adjusted according to the epidemiological situation, the clinical context and the national resources, in order to always guarantee that the highest-performing diagnostic solutions are available. Not all cost-saving strategies guarantee good analytical performance.
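    As a quick arithmetic check on the per-test figures quoted above (180 CHF early in the pandemic versus 82 CHF at the end of 2020):

```python
# Relative cost reduction of a single RT-PCR test, from the figures in the
# abstract: ~180 CHF (February-April 2020) down to ~82 CHF (end of 2020).

early_cost, late_cost = 180.0, 82.0
reduction = (early_cost - late_cost) / early_cost  # fraction saved per test
```

    This corresponds to a roughly 54% drop in the per-test cost over the first year of the pandemic.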

    Problematising international placements as a site of intercultural learning

    This paper theorises some of the learning outcomes of a three-year project on student learning in international social work placements in Malaysia. It examines the problematic issue of promoting cultural and intercultural competence through such placements, discussing overlapping hegemonies in terms of the isomorphism of social work models and of the nation state, together with those relating to professional values and knowledge, and the tyrannies of received ideas. Cultural competence as the rationale for international placements is critically discussed in terms of the development of the graduating social worker as a self-reflexive practitioner. The paper additionally discusses the development of sustainable international partnerships able to support student placements, and the issue of non-symmetrical reciprocation typical of wide socio-economic differentials across global regions.

    Knowledge Distillation for Multi-task Learning

    Multi-task learning (MTL) aims to learn a single model that performs multiple tasks, achieving good performance on all tasks at a lower computational cost. Learning such a model requires jointly optimizing the losses of a set of tasks with different difficulty levels, magnitudes, and characteristics (e.g. cross-entropy, Euclidean loss), leading to the imbalance problem in multi-task learning. To address the imbalance problem, we propose a knowledge distillation based method in this work. We first learn a task-specific model for each task. We then learn the multi-task model by minimizing the task-specific losses and encouraging it to produce the same features as the task-specific models. As each task-specific network encodes different features, we introduce small task-specific adaptors to project the multi-task features to the task-specific features. In this way, the adaptors align the task-specific feature and the multi-task feature, which enables balanced parameter sharing across tasks. Extensive experimental results demonstrate that our method can optimize a multi-task learning model in a more balanced way and achieve better overall performance. Comment: We propose a knowledge distillation method for addressing the imbalance problem in multi-task learning
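    The adaptor idea can be sketched as follows (hypothetical shapes and names, not the paper's implementation): a small per-task linear adaptor projects the shared multi-task feature toward each task-specific teacher feature, and the alignment loss is the squared distance after projection.

```python
# Minimal sketch of feature alignment via a per-task linear adaptor: the
# shared feature is projected by a task-specific matrix and compared to the
# task-specific (teacher) feature with a squared-error alignment loss.

def linear_adaptor(weights, feature):
    """Project a shared feature vector with a task-specific matrix."""
    return [sum(w * f for w, f in zip(row, feature)) for row in weights]

def alignment_loss(adaptor_w, shared_feat, task_feat):
    projected = linear_adaptor(adaptor_w, shared_feat)
    return sum((p - t) ** 2 for p, t in zip(projected, task_feat))

# Identity adaptor with matching features gives zero alignment loss.
loss = alignment_loss([[1, 0], [0, 1]], [0.3, 0.7], [0.3, 0.7])
```

    Because each task gets its own adaptor, the shared backbone is not forced to match any one task's feature space directly, which is what allows balanced sharing.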

    Modern slavery in business: The sad and sorry state of a non-field

    “Modern slavery,” a term used to describe severe forms of labor exploitation, is beginning to spark growing interest within business and society research. As a novel phenomenon, it offers potential for innovative theoretical and empirical pathways to a range of business and management research questions. And yet, what we might call a “field” of modern slavery research in business and management remains significantly, and disappointingly, underdeveloped. To explore this, we elaborate on the developments to date, the potential drawbacks, and the possible future deviations that might evolve within six subdisciplinary areas of business and management. We also examine the value that nonmanagement disciplines can bring to research on modern slavery and business, examining the connections, critiques, and catalysts evident in research from political science, law, and history. These, we suggest, offer significant potential for building toward a more substantial subfield of research.